Modulation And Delay Line Based Digital Audio Effects
Among musicians and recording engineers, audio effects are mainly described and identified by their acoustical effect. Audio effects can also be categorized from a technical point of view. The main criterion is found to be the type of modulation technique used to achieve the effect. After a short introduction to the different modulation types, three more sophisticated audio effect applications are presented, namely single sideband domain vibrato (mechanical vibrato bar simulation), a rotary speaker simulation, and an enhanced pitch transposing scheme.
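To make the delay-line idea concrete, the following is a minimal sketch of a generic vibrato built from a sinusoidally modulated delay line with linear interpolation. It is not the single sideband scheme from the paper; the function name `vibrato` and the parameters `rate_hz` and `depth_ms` are assumptions chosen for illustration.

```python
import math

def vibrato(x, sample_rate, rate_hz=5.0, depth_ms=2.0):
    """Generic delay-line vibrato (illustrative sketch, not the paper's method).

    The input is read back through a delay whose length is modulated by a
    low-frequency sine, which shifts the perceived pitch up and down.
    """
    depth = depth_ms * 1e-3 * sample_rate  # peak delay deviation in samples
    base = depth + 1.0                     # mean delay; keeps the read position causal
    y = []
    for n in range(len(x)):
        delay = base + depth * math.sin(2.0 * math.pi * rate_hz * n / sample_rate)
        pos = n - delay                    # fractional read position
        i = math.floor(pos)
        frac = pos - i
        # Samples before the start of the signal are treated as silence.
        a = x[i] if 0 <= i < len(x) else 0.0
        b = x[i + 1] if 0 <= i + 1 < len(x) else 0.0
        # Linear interpolation between the two neighbouring samples.
        y.append((1.0 - frac) * a + frac * b)
    return y
```

With `depth_ms=0.0` the modulation vanishes and the function reduces to a fixed one-sample delay, which is a convenient sanity check.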
An amplitude- and frequency-modulation vocoder for audio signal processing
The decomposition of audio signals into perceptually meaningful modulation components is highly desirable for the development of new audio effects on the one hand and as a building block for future efficient audio compression algorithms on the other hand. In the past, there has always been a distinction between parametric coding methods and waveform coding: while waveform coding methods scale easily up to transparency (provided the necessary bit rate is available), parametric coding schemes are subject to the limitations of the underlying source models. On the other hand, parametric methods usually offer a wealth of manipulation possibilities which can be exploited for the application of audio effects, while waveform coding is strictly limited to the best possible reproduction of the original signal. The analysis/synthesis approach presented in this paper is an attempt to bridge this gap by enabling a seamless transition between both approaches.
An iterative Segmentation Algorithm for Audio Signal Spectra Depending on Local Centers of Gravity
Modern music production and sound generation often rely on the manipulation of pre-recorded pieces of audio, so-called samples, taken from a huge database. Consequently, there is an increasing demand to adapt these samples extensively and flexibly to any new musical context. For this purpose, advanced digital signal processing is needed in order to realize audio effects like pitch shifting, time stretching or harmonization. Often, a key part of these processing methods is a signal-adaptive, block-based spectral segmentation operation. Hence, we propose a novel algorithm for such a spectral segmentation based on local centers of gravity (COG). The method was originally developed as part of a multiband modulation decomposition for audio signals. Nevertheless, this algorithm can also be used in the more general context of improved vocoder-related applications.
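As a rough illustration of what a local center of gravity in a magnitude spectrum means, the sketch below computes the magnitude-weighted mean bin index inside a window and then repeatedly re-centers the window on that value until it stops moving. This is an assumed, simplified iteration for illustration only and not the segmentation algorithm proposed in the paper; `local_cog`, `converge_to_cog` and `half_width` are hypothetical names.

```python
def local_cog(mag, center, half_width):
    """Magnitude-weighted mean bin index in a window around `center`."""
    lo = max(0, center - half_width)
    hi = min(len(mag), center + half_width + 1)
    total = sum(mag[lo:hi])
    if total == 0.0:
        # No energy in the window: leave the position unchanged.
        return float(center)
    return sum(k * mag[k] for k in range(lo, hi)) / total

def converge_to_cog(mag, start, half_width, max_iter=50):
    """Move a candidate bin towards its local COG until it is stable."""
    pos = start
    for _ in range(max_iter):
        new_pos = int(round(local_cog(mag, pos, half_width)))
        if new_pos == pos:
            break
        pos = new_pos
    return pos
```

Candidates started at different bins that converge to the same position could then be grouped into one spectral segment, which is one plausible way to turn COG estimates into band boundaries.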
An Enhanced Modulation Vocoder for Selective Transposition of Pitch
In previous papers, the concept of the modulation vocoder (MODVOC) has been introduced and its general capability to perform a selective transposition on polyphonic music content has been pointed out. This enables applications that change the key mode of pre-recorded PCM music samples. In this paper, two enhancement techniques for selective pitch transposition by the MODVOC are proposed. The performance of the selective transposition application and the merit of these techniques are benchmarked by results obtained from a specially designed listening test methodology which is capable of handling extreme changes in pitch with respect to the original audio stimuli. Results of this subjective perceptual quality assessment are presented for items that have been converted between minor and major key mode by the MODVOC and, additionally, by the first commercially available software which is also capable of handling this task.
Assessing Applause Density Perception Using Synthesized Layered Applause Signals
Applause signals are the sound of many persons gathered in one place clapping their hands and are a prominent part of live music recordings. Usually, applause signals are recorded together with or alongside the live performance and serve to evoke in the listener the feeling of participating in a real event. Applause signals can be very different in character, depending on the audience size, location, event type, and many other factors. To characterize different types of applause signals, the attribute of ‘density’ appears to be suitable. This paper reports first investigations into whether density is an adequate perceptual attribute to describe different types of applause. We describe the design of a listening test assessing density and the synthesis of suitable, strictly controlled stimuli for the test. Finally, we provide results, both on strictly controlled and on naturally recorded stimuli, that confirm the suitability of the attribute density to describe important aspects of the perception of different applause signal characteristics.
Blind Upmix for Applause-like Signals Based on Perceptual Plausibility Criteria
Applause is the result of many individuals rhythmically clapping their hands. Applause recordings exhibit a certain temporal, timbral and spatial structure: claps originating from a distinct direction (i.e., from a particular person) usually have a similar timbre and occur in a quasi-periodic repetition. Traditional approaches for blind mono-to-stereo upmix do not consider these properties and may therefore produce an output with suboptimal perceptual quality that can be attributed to a lack of plausibility. In this paper, we propose a blind upmixing approach for applause-like signals which aims at preserving the natural structure of applause signals by incorporating periodicity and timbral similarity of claps into the upmix process, thereby supporting the plausibility of the artificially generated spatial scene. The proposed upmix approach is evaluated by means of a subjective preference listening test.
Decorrelation for Immersive Audio Applications and Sound Effects
Audio decorrelation is a fundamental building block for immersive audio applications. It is used in parametric spatial audio coding, audio upmix, sound effects and audio rendering for virtual or augmented reality. In this paper, we provide insights into the practical design considerations of an audio decorrelator, using the example of the decorrelator contained within the upcoming MPEG-I Immersive Audio ISO standard [1]. We describe the desirable properties of such a decorrelator, common approaches for implementation and our particular technology choices for the decorrelator used in MPEG-I for rendering sound sources with homogeneous extent.
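One widely known building block for decorrelators is the all-pass filter, which alters the phase of a signal while leaving its magnitude spectrum unchanged. The sketch below implements a single Schroeder all-pass section as an illustration of that general idea; it is not the MPEG-I decorrelator, and the name `schroeder_allpass` and the parameters `delay` and `gain` are assumptions.

```python
def schroeder_allpass(x, delay, gain):
    """Single Schroeder all-pass section (illustrative sketch only).

    Difference equation: y[n] = -g * x[n] + x[n - D] + g * y[n - D]
    The magnitude response is flat, so only the phase (and therefore the
    temporal fine structure) of the signal is changed.
    """
    y = []
    for n in range(len(x)):
        x_delayed = x[n - delay] if n >= delay else 0.0
        y_delayed = y[n - delay] if n >= delay else 0.0
        y.append(-gain * x[n] + x_delayed + gain * y_delayed)
    return y
```

In practice, several such sections with different delays are often cascaded, and a distinct set of delays per output channel yields mutually decorrelated copies of the input. The choice of delays and gains is a design decision that this sketch leaves open.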